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ABSTRACT 

The 'Internal Linear Combination' (ILC) component separation method has been 
extensively used to extract a single component, the CMB, from the WMAP multi- 
frequency data. We generalise the ILC approach for separating other millimetre astro- 
physical emissions. We construct in particular a multidimensional ILC hltcr, which can 
be used, for instance, to estimate the diffuse emission of a complex component origi- 
nating from multiple correlated emissions, such as the total emission of the Galactic 
interstellar medium. The performance of such generalised ILC methods, implemented 
on a needlet frame, is tested on simulations of Planck mission observations, for which 
we successfully reconstruct a low noise estimate of emission from astrophysical fore- 
grounds with vanishing CMB and SZ contamination. 

Key words: Cosmic Background Radiation - Methods: data analysis - ISM: general 



1 INTRODUCTION 

The separation of emissions originating from distinct astro- 
physical components in observations of the millimetre and 
sub-millimetre sky is an important step in the scientific ex- 
ploitation of such observational data. Various methods have 
been developed to extract the emission of several compo- 
nents out of multi-frequency Cosmic Microwave Background 
(CMB) observations such as those of the WMAP and Planck 
space missions (see, e.g., Dclabrouille & Cardoso ( 2009]) for 
a review). 

In many cases, such methods define components 
through an (explicit or implicit) assumption that the ob- 
servations are a linear mixture of unknown templates (or 
sources) scaling rigidly with frequency, i.e.: 
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(1) 



Such methods also assume a fixed number of astrophys- 
ical emissions (e.g. CMB anisotropies, thermal Sunyaev- 
Zel'dovich (SZ) effect, thermal dust emission, synchrotron 
emission...). The rigid scaling of component emission with 
frequencies is imposed by the fact that the mixing coeffi- 
cients Aij depend solely on i and j (observation channel 
and component), and not on the pixel p. 

Assuming that such a representation holds, blind com- 
ponent separation methods such as the Spectral Match- 
ing ICA ( Delabrouille, Cardoso fc Patanchon| |2003 



doso et al. 20081, FastICA (Hyvarinen 1999; Maino etal 



2002), JADE (Cardoso 1998), CCA (Bonaldi et al. 2006 



Car- 
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or GMCA ( |Bobin et~aT| |2008[ ) are designed to solve the 



problem of recovering the components of interest when their 
mixing matrix (the matrix of mixing coefficients, which spec- 
ifies how much each component contributes to a frequency 
observation) is unknown. By exploiting the assumption of 
statistical independence between the components, the mix- 
ing matrix can be blindly estimated up to permutation and 
rescaling of its columns. Once an estimate of the mixing 
matrix is available, the components can be separated by in- 
verting the linear system, possibly taking into account the 
presence of instrumental noise. This has been investigated by 



a number of authors ( Tegmark & Efstathiou 1996 Bouchet 
fc Gispert[ [19991 |Hobson et al.| |1998| |Delabrouille, Patan- 



chon & Audit 



2002) 



However, in millimetre and sub-millimetre wave obser- 
vations, some components cannot be correctly modelled as 
a single template which would be simply scaled by mixing 
coefficients (Tegmark 19981. Emissions from the Galactic 



interstellar medium exhibit frequency scaling which depends 
on local conditions (temperature, chemical composition) at 
the location of emission, and hence are variable over the 
celestial sphere. 

Some of the blind component separation methods 
quoted above can take into account the possible variation 
of the foreground frequency scaling as a function of the ob- 
served pixel. The CCA method, for instance, can use a pixel- 
localized model of the foreground spectral indices. The Spec- 
tral Matching ICA can be (and has been) implemented on 
wavelet frames. All methods can be applied independently 
on several regions of the sky, allowing for a different pa- 
rameter set in each of the selected regions. However, such 
localisation of the model and of the solution is then the 
result of prior choices. The number of foreground compo- 
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nents is fixed, the regions to be masked, or to be analysed 
separately, are selected a priori. In reality, the number of 
relevant components is not well known before the data is 
analysed, and also varies in practice both over the sky and 
over the scales. For instance, most of the galactic foreground 
components become small, and possibly negligible, at small 
angular scales and at high Galactic latitude. The number of 
relevant components in any data set, however, is bound to 
depend on the level of instrumental noise. 

The total foreground emission can be separated by sub- 
tracting a CMB map estimate from the observation maps. 
This method has been investigated on the WMAP obser- 



vations by Ghosh et al. (20101 and employed so far by the 



Planck collaboration in their early results (Planck Collab- 



oration et al. 2011 1. However, the CMB-free maps suffer 
from excessive noise contamination: as the removed CMB 
map itself is a low noise estimate obtained from a minimum 
variance procedure, most of the instrumental noise ends-up 
in the foreground maps, which must then be re-processed 
(e.g. filtered) after CMB subtraction. The new component 
separation method investigated in this paper is an extension 
of the Internal Linear Combination (ILC) method, aimed 
at the reconstruction of the total foreground emission with 
the intention of both relaxing the prior assumption on the 
number of foreground components, and of performing a lo- 
cal processing for best suppression of the contamination of 
the reconstructed foreground components by residual instru- 
mental noise. 

Classical ILC methods do not assume a particular 
parametrisation of the foreground emission. They offer a 
simple way to extract the map of a single component of 
interest and have been used by several authors in the anal- 
ysis of the maps obtained by the WMAP satellite to ex- 
tract a CMB map (iBennett et al.l|2003||Eriksen et al 



2008; 



2004 



Park, Park fc Gott| |2007| |Kim, Naselsky fc Christensen 



Delabrouille et al., 2009). The traditional ILC, how- 



ever, can only recover components for which the emission 
scales rigidly with frequency (hence its use for separating 
a CMB map). In addition, the ILC performs satisfactorily 
only if the component of interest is not correlated with the 
other emissions. 



In a previous publication (Remazeilles, Delabrouille & 



Cardoso 20111, we have introduced the Constrained ILC, 



which extends the ILC to the case where there is more than 
one component of interest (e.g. CMB and thermal SZ), and 
one wishes to cancel out the contamination from one of them 
into the recovered map of the other. In the present paper, we 
generalise further the ILC and address the blind separation 
of multidimensional components which cannot be modelled 
as one single template scaling with frequency according to 
a single (pixel independent) emission law. 



2 CMB ESTIMATION BY STANDARD ILC 
2.1 Model of the measurement 

In all of the following, we assume that all available maps 
(iV bs maps) can be written, for all pixels p of the observed 
maps, in the form 



where s(p) is the CMB template map, z(p) the thermal 
Sunyaev-Zel'dovich effect, f(p) is the emission of the rest 
of the foregrounds as they would be observed by the in- 
strument in absence of anything else, and n(p) is the in- 
strumental noise. Note that f(p) and n(p) are represented 
with iVobs maps each, while the CMB and the SZ effect are 
represented by one single map each, scaled across frequency 
channels using CMB and SZ scaling coefficients, a and b, 
which are assumed to be known. 

Depending on the objective, any of as(p), bz(p) or f(p) 
can be considered as 'noise' and implicitly included in the 
noise term. Similarly, depending on the objectives of the 
component separation, as(p) or bz(p) can be considered as 
part of the total 'foreground term' i.e. implicitly included 
in tip)- 

2.2 Extraction of the CMB 

The ILC provides the estimate silc of the CMB component 
s by forming a linear combination of the iVobs observed maps 
which has unit response to the component of interest and 
has minimum variance. Straightforward algebra leads to: 



silc 



a* R 

a* R~ 



y 



(3) 



where R is the empirical covariance matrix of the observa- 
tions, a has dimension iV b s x 1, and y is the iVcba x 1 vector of 
observation maps. This standard ILC can be used similarly 
to recover an estimate Silc of the SZ effect (with a replaced 
by b in Eq. Q). Note that the quality of CMB reconstruc- 
tion with an ILC depends on the accuracy with which a 
is known. In presence of errors (for instance calibration er- 
rors), there is no guarantee that the CMB is preserved ( |Dick,| 
Remazeilles & Delabrouille 20101. 



Assuming no correlations between the components, the 
total covariance matrix R of the observations y can be writ- 
ten as: 



R = oa'CcMB + bb Csz + Rfg + Rjv 



(4) 



2.3 Wavelet space ILC 



y(p) = as(p) + bz(p) + f(p) + n(p) 



(2) 



In its simplest implementation, the ILC is performed on the 
complete maps, and one single global matrix R is used for 
the whole data set. This requires all maps to be at the same 
resolution. It is possible, however, to decompose the orig- 
inal maps as sums of different data subsets, covering each 
a different region in pixel space or in harmonic space, to 
apply independent versions of the ILC to the different data 
subsets, and then to recompose a map from all these inde- 
pendent results. 

The main interest of such a decomposition is the possi- 
bility to adapt the ILC filter to local contamination condi- 
tions. Such localisation of the filter is useful in pixel space: 
the Galactic emissions are stronger in the Galactic plane, 
whereas noise dominates the total error at high Galactic lat- 
itudes. It is also useful in harmonic space, because contami- 
nants do not all have the same angular power spectrum and 
because of the channel-dependence of instrumental beams. 

Note however that some care must be taken when sub- 
dividing the original data into small subsets. The ILC in- 
deed relies on the component of interest to be uncorrelated 
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with the contaminants (i.e. (s(p)rii(p)) = 0, for all channels 
of observation i). If this condition does not hold, the ILC 
introduces a bias in the reconstruction. This has a conse- 
quence on the minimum data size on which the ILC should 
be implemented: too few independent data points results 
in empirical correlations between the component of interest 
and the contaminants, which generates a reconstruction bias 



as described in the appendix of Delabrouille et al. (20091. 



In the present paper, observations are decomposed us- 
ing the spherical needlets discussed, in the context of CMB 



data analysis, by several authors (see, e.g., Pietrobon, Balbi 
fc Marinucci| (|2006|>; |Marinucci et al.| ( |2008[ ) ; |Fay et al. 
( |2008| ); |Guilloux, Fay fc Cardoso| ( |2009| >). This needlet de- 
composition provides localisation of the ILC filters both in 
pixel and in harmonic space. 

We define a set of spectral windows h^'(£) such that, 
over the useful range of £, we have: 



(5) 



Maps of wavelet (needlet) coefficients are obtained, for each 
observed map y(p), by inverse spherical harmonic transform 
(SHT) of the associated map SHT coefficients yi m filtered 
by the spectral windows hi : 



7 



U) 



(p) = E Vlm h t Y e™(p) 



(6) 



For each scale j, for each pixel p of the corresponding needlet 
coefficients maps ja (one such map for each observation 
a), the empirical covariance matrix R used in equation ^ is 
computed from an average, in a domain T> p centered at pixel 
p and including some neighbouring pixels, of the product of 
needlets coefficients. The ab entry is given by 



R ab (p) = 



E 



Ta 



U) (p'hl j) (p) 



(7) 



In |Delabrouille et al.| ( |2"009[ ), the practical implementation 
uses, as domains T> p , HEALPix 'super-pixels' obtained by 
grouping 32 x 32 pixels of the needlet coefficient maps 7"' (p) 
(making use of the hierarchical definition of HEALPix pix- 
els). Here, we use a slightly different prescription: we smooth 
instead the product map r y'a\p)^\p) with a symmetric, 
Gaussian window in pixel space. This avoids artificial dis- 
continuities at super-pixel edges. 



3 FOREGROUND ESTIMATION BY 
MULTIDIMENSIONAL ILC 

We now address the problem of estimating the set of maps 
f(p), i.e. a 'catch-all' foreground component comprising 
the emission of the diffuse Galactic interstellar medium 
(ISM), and of numerous Galactic and extragalactic compact 
sources. The objective is to construct estimated maps f(p) 
that 'best match' what would be observed by the instrument 
in the absence of CMB, SZ and noise (see Eq.[2}. 

Astrophysical emission originating from the Galactic 
ISM and from numerous extragalactic sources is qualita- 
tively different from the CMB and the SZ effect. Each of the 
latter is somewhat special in the sense that its emission can 
be modeled, with good accuracy, as a single template scaling 



in a known way with frequency. The total foreground (FG) 
emission / comprises contributions from several different 
processes. In addition, we cannot even assume a priori that 
a linear mixture model (in which each map constituting / 
would be a linear superposition of well defined templates) 
does hold. 

For extracting such emissions from multi-frequency ob- 
servations, we propose to generalize the ILC method to ad- 
dress the case of such a 'multidimensional component'. 



3.1 Multidimensional components 

Let Rfg = (// } denote the covariance matrix of the ob- 
served foregrounds in A^bs frequency channels. This iV b s x 
iV ba matrix Rfg will be refered to as the FG covariance 
matrix. 

Among astrophysical foregrounds included in the 
'catch-all' component /, the ISM of our own galaxy is the 
main contributor. It emits via the combination of several 
processes (synchrotron, free-free, thermal dust, 'anomalous' 
dust emission, molecular lines...). In previous work, some 
of these processes have been individually modelled each by 
a fixed template and an emission law. Bouchct fc Gispert] 
(1999), for instance, assume that synchrotron emission scales 
with frequency proportionally to v~ 9 , free-free proportion- 
ally to is~ 016 , and dust proportionally to u 2 B v (T) with 
T = 18 K. Since the emission of the ISM in each channel 



is described as a linear mixture of three templates, such a 
model predicts that the ISM covariance matrix (which we 
will denote as Rism) is a rank 3 matrix. When the contribu- 
tion of extragalactic compact sources is neglected (assuming 
bright point sources are extracted from the maps, and faint 
ones contribute a negligible amount of emission), the fore- 
ground covariance matrix itself, Rfg, is equal to Rism (the 
covariance matrix of the Galactic ISM emission), and is a 
rank 3 matrix. 

Such a model is too crude in the context of the very sen- 
sitive measurements performed by WMAP and Planck: the 
emission laws of the Galactic emissions vary as a function 
of the direction on the sky. To make things even more com- 
plex, the background of compact sources contributes emis- 
sion that becomes significant for measurements such as those 
of the Planck mission JPlanck Collaboration et al. 20111, 



and that can not be modelled at all as the sum of a few 
independent components. 

The question of the rank of the FG covariance matrix 
is a crucial one for component separation. This matrix is 
expected to be, strictly speaking, full rank. In practice how- 
ever, the issue is slightly more subtle. Consider its eigen- 
decomposition: Rfg = VDV*, where V is an orthonormal 
matrix and D is a diagonal matrix with eigen- values sorted 
in decreasing order. While the three-component model of 



Bouchet & Gispert (19991, predicts that only the first three 



eigen- values are non-zero, a model with spatially varying 
spectral indices, and numerous additional emission processes 
('anomalous' dust emission, molecular lines, extragalactic 
source background) predicts that all the eigen-values are 
non-zero. However, if there is only a small variation of the 
spectral indices, and if some components are very weak, it 
is expected, at least in some regions of the sky or at some 
angular scales, that the smallest eigen-values are very close 



4 Mathieu Remazeilles, Jacques Delabrouille, Jean-Francois Cardoso 



to zero so that Rfg is 'almost rank-deficient' (see section 3.4 
below for a more rigorous statement). 

In this paper we propose, as in Cardoso et al. (2008), to 



model the FG covariance matrix as an -/V b s x iV bs matrix 
of rank TVfg, not necessarily equal to A^bs- Loosely speak- 
ing, TVfg counts the number of different templates needed 
to represent most of the emission of the FG in our data set. 
In other words, we try to capture all foreground emission 
as resulting from A^fg (possibly correlated) templates. The 
integer 7Vfg is called the (effective) FG dimension and may 
vary over the sky with respect to the pixel p, or with re- 
spect to £ in harmonic space, or with respect to the needlets 
domain considered for a decomposition of the maps on a 
needlet frame. 



3.2 The foreground subspace 

The analysis in this paper is performed on a needlet frame. 
The temperature map needlet coefficients are indexed by 
(j, k), where j denote the scale and k the pixelQ 

In a given needlet domain Tjjp , if the FG covariance ma- 
trix Rfg is a (symmetric, non negative) AT bs X Aobs matrix 
of rank -/Vfg, then foreground emission can be represented 
as a superposition of 7Vfg templates: 



/ = Fg 



(8) 



where the A^bs X JVfg matrix F is called the foreground 
mixing matrix and where g is a vector of dimension A^fg • It 
follows that the FG covariance matrix is 



Rfg = (ff) = F(gg*)F < = FGF* 



(9) 



where G is a Afg X Afg full rank covariance matrix. 

Note two important points. First, the templates g are 
not expected to correspond to physical foregrounds. They 
are just a basis of the A^G-dimensional subspace spanned 
by /. We are not interested in recovering g. Our objective 
with the method discussed here is to recover / (in addi- 
tion to s and z). Secondly, matrix F and its number Afg 
of columns may depend on the domain T>^ considered. For 
instance, at high Galactic latitude, it is quite possible that 
our observations contain negligible emission from some of 
the Galactic foregrounds, but not so at low Galactic lat- 
itude. The needlet implementation allows us to modulate 
the effective dimension of the multidimensional foreground 
component both in pixel space and in harmonic space, i.e. 
vary N-pg across the sky regions and the physical scales. 

We also stress that, unlike in the case of CMB and SZ re- 
construction, where the mixing vectors a and b are assumed 
fully known a priori, we do not assume here that the matrix 
F is known. We will not resort to prior physical knowledge 
about the components of the FG emission to determine ma- 
trix F. In fact, as the basis templates g do not correspond to 
anything physically meaningful, we are not even interested 
in determining F itself but, for reconstruction purposes, only 
the product / = Fg. It is only assumed that matrix Rfg has 



1 Note that the methods described throughout the paper do not 
require a needlet frame in particular and can be implemented 
in pixel space as well, where domains T> should be indexed by 
pixels p, or in harmonic space with the domains indexed by {(., m) 
coefficients. 



a given rank A^fg (which can be estimated from the data, if 
needed) in the needlet domain. 

Matrix F cannot be determined from the data only, that 
is, without making use of some prior information or assump- 
tion about g. Indeed, let T be some invertible Afg x A^fg 
matrix and consider the transformed matrices F = FT 1 and 
G = TGT*. These transformed matrices are an alternate, 
completely equivalent, factorization of the FG covariance 
matrix since, by construction, FGF* = FGF . Physically, it 
means that the AVg underlying templates g can be replaced 
by any other linear combination Tg of them (provided the 
linear combination is not degenerate, i.e. T is invertible). 

However, as we shall see in section [373] the implemen- 
tation of the ILC filter for estimating the total FG emission 
does not require the full knowledge of F. Indeed, the expres- 
sion of that filter is strictly unchanged upon the introduction 
of such an invertible factor T. In section [374] we show how 
matrix F can be estimated up to multiplication by a right 
factor T. It is worth stressing again that this indetermina- 
tion means that we are only concerned with estimating the 
column space of matrix F (noted Col(F) throughout the pa- 
per). That A'FG-dimensional space can be called the 'FG 
subspace'. Our working assumption that Rfg has rank A^fg 
means that the FG data has a covariance structure which is 
unknown but is constrained to live in the FG subspace. 

Physically, accepting this indetermination amounts to 
giving up, during the component separation stage discussed 
here, distinction between processes of emission on the ba- 
sis of physical criteria such as emission process or physical 
origin. Obviously, this is not fully satisfactory from an as- 
trophysicist's point of view, since in the end we would like 
to know what is the source of the observed emission. This 
distinction among sources of FG emission, however, can be 
made at a later stage of the data analysis, i.e. we can first 
separate CMB and SZ from other foregrounds, and then put 
physics into the interpretation of the reconstructed multidi- 
mensional FG component and interpret it as the sum of 
emissions from a number of physical emission processes. 

3.3 Multidimensional ILC filter 

Aiming at a direct estimation of the foregrounds, we gener- 
alize the ILC method to address the case of a multidimen- 
sional component (here A^G-dimensional, where Nfg is the 
number of components, i.e. the rank of the foreground co- 
variance matrix). In a given needlet domain, we model the 
observation maps, collected into the A^bs X 1 vector y, as 

y = Ax + n, (10) 

where n is the A^bs X 1 vector of instrumental noise and 



(11) 



Note that no rigid scaling with frequency is assumed since 
all these quantities are needlet-dependent, i.e. they depend 
both on the scale considered and on the pixel. Here the 
(A^fg + 1) x 1 signal vector x contains the CMB emission s 
as first entry and the A'fg X 1 vector g which collects the 
emission of the A^fg components needed to model the total 
foreground emission. The A^bs x {Nfg + 1) mixing matrix 
A contains, as a first column, the N h s X 1 vector a giving 
the frequency scaling of the CMB component. The other 
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columns correspond to the N t, s x A?fg foreground mixing 
matrix F, i.e. they span the foreground subspace. Note that 
this assumes that a itself cannot be obtained by linear com- 
binations of the columns of F (more about this later) . 

As a refinement, it can be useful to single out both 
the CMB and the SZ, in which case the second column in A 
explicitly appears as the frequency scaling vector b of the SZ 
component (and the SZ can be considered as excluded from 
the rest of the foregrounds). We get back to this refinement 
in sections 13.51 and HI 

Eq. ( 10 I assumes that all observations are at the same 



resolution, which is needed to implement the ILC filter (for 
practical implementation, maps are put to the same res- 
olution by partial deconvolution in harmonic space). The 
localisation in harmonic space allows dropping out some of 
the channels at high £ if needed by reason of insufficient 
resolution. 

We consider the estimation of / by a linear operation 



f = By, 



(12) 



where, as in standard (one-dimensional) ILC, the N bsXN b s 
ILC weight matrix B is designed to offer unit gain to the 
foregrounds while minimizing the total variance of the vec- 
tor estimate /. In other words, matrix B is the minimizer of 
E(\ By 2 ) under the constraint BF = F. The weights matrix 
B thus solves the following constrained variance minimiza- 
tion problem 



min Tr (BRB*) 

BF— F v 1 



(13) 



where R is the covariance matrix of the observations y and 
Tr is the matrix trace operator. That problem can be solved 
by introducing a Lagrange multiplier matrix A and the La- 
grangian 



£(B, A) = Tr (BRB 4 ) - Tr (A*(BF - F)) 



(14) 



By differentiating \14\ with respect to B, one finds that 
d£(B, A)/<9B = entails 



2BR = AF 



(15) 



By imposing the constraint BF = F on ( |15[ ), one then finds 
that A = 2F(F'R~ 1 F)~ 1 . Hence, the solution of the problem 
(131 is the foreground ILC weight matrix given by 



B = F (F'R _1 F) 



Vr 



(16) 



Comparing Eq. (16 1 to Eq. pj|, multi-dimensional ILC ap- 
pears as a direct generalization of one-dimensional ILC^j 

One can immediately notice that expression ( |16[ ) for B 
is invariant if F is changed into FT for any invertible matrix 
T. Hence, as already mentioned in Section [3.2| implement- 



ing the foreground ILC filter ( 16 1 only requires that the 
foreground mixing matrix F be known up to right multipli- 
cation by an invertible factor. Again, in other words, the 
only meaningful and mandatory quantity for implementing 
a multi-dimensional ILC is the column space of F. 



2 The ILC estimate of the CMB vector of emission in 
each frequency channel is obtained by applying the filter 
a (a'R _1 a) a'R -1 (i.e., the filter of eq. n3b multiplied on the 
left by the vector a). 



3.4 Estimation of the foreground subspace 

In this section, we propose a method for estimating the fore- 
ground subspace locally, that is, in each needlet domain. 
We consider only the case where the model accounts for the 
CMB, an Nfg -dimensional foreground component and noise 
at a known level: 



R = CcMBaa + FGF' + R r 



(17) 



and we want to estimate the foreground subspace Col(F) 
from an estimate R of R. Define the A^bs x (ATfg + 1) matrix: 



L = aC. 



1/2 
CMB 



FG 



1/2 



(18) 



1/2 

where the first column oC CMB of L, containing the CMB fre- 
quency scaling vector (which is known) is distinguished from 
the A^fg unknown remaining columns FG 1 / 2 associated to 
the foregrounds. So the signal part of the covariance matrix 
is LL': 

R = LL* + Rjv 

Our procedure for estimating the column space Col(F) of the 
foreground mixing matrix F is in two steps. In a first step, 
we obtain an estimate of L up to right multiplication by a 
rotation matrix (and an estimate for the dimension A^fg of 
the foreground subspace) using the knowledge of the noise 
covariance matrix. In a second step, we use the fact that 
the first column of L is known (up to scale) to obtain an 
estimate of Col(F). That is described next. 

Denote the eigenvalue decomposition of the noise- 



whitened signal covariance matrix R 



-1/2 



LL R 



-1/2 



R-^LL'R- 1 ' 2 



N KK AT 



1/2 



UAU*. 

where U is orthonormal: UU* = I, and A is diagonal. Now, 
-R- 1/2 (LL' + R N )R JV 1/2 
= R Jv 1 / 2 LL*R Jv 1 / 2 + l 
= UAU* + UU* 
= U[A + I]U', 



-1/2 



N and R N 



-1/2 



LL t R JV L//2 share the 



showing that R]^V 2 RR 
same eigen-vectors but that the former has its eigenvalues 
shifted by 1 with respect to the latter. Further, if L has rank 
Nfg + 1 then its eigen-structure actually is 



UAU* = [U S U„] 



As 





[U s U r 



(19) 



where U s has (ATfg + 1) columns, U n has A^bs — (ATfg + 1) 
columns, and As is a (ATfg + 1) x (Nfg + 1) diagonal matrix. 

3.4-1 Estimation o} Nfg and L 

In the needlet domain considered, given an estimate R of R, 
we compute the eigen-decomposition: 



R~ 1/2 pp 



1/2 



UDU 



(20) 



and, similar to eq. (19 1, we denote by Ds the sub-block of D 
corresponding to the eigenvalues that are larger than (1 + e) 
and by Us the corresponding subset of columns of U. Here 
e is a threshold above which the observation is not domi- 
nated by instrumental noise (see section [4] for the choice of 
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the threshold). This threshold condition thus provides an 
estimate for the dimension Afg of the foreground subspace 
in the needlet domain given the dimension (Afg + 1) of the 
sub-block Ds which fulfills the threshold condition. 

In this preprocessing, the dimension of the estimated 
signal subspace, (JVfg + 1), depends on the level of noise 
in the needlet domain considered, so the signal subspace is 
estimated locally, both in space and in scale. This processing 
thus locally performs a rank reduction of the observations 
covariance matrix allowing the reduction of the instrumental 
noise in the reconstruction. 

By construction, the matrix 

M = R^ 2 Us(d s -|V /2 (21) 



• Compute the AFG-dimensional ILC filter 



is such that R 



-1/2 



MM R 



-1/2 



is close to R 



-1/2 



LL'R 



-1/2 



if 



\ N .V,,», lO UW ,\ N 1.1. ,\ N 

R is close to R. That property, in turn, implies that MO 
is close to L for some (undetermined) rotation matrix O, 
completing the first step of our estimation procedure. 



3-4-2 Estimation of the foreground column space Col(F) 

In the second step, we note that the rotation matrix O 
should be such that MO is close to L. However, only the 
first column of L is known, up to scale. Hence, we partition 
O as O = [ v | V] where v is a unit norm vector and V is an 
(Afg + 1) x TVfg matrix. The only available constraint is 
thus that Mv should be close to the first column of L. How- 
ever, we cannot expect to find a v such that Mv is strictly 

1/2 ~~ 

equal to aC CMB because M is estimated from the data so 
that Col(M) does not necessarily contain a (as would be the 
case for R = R). The best we can do is to determine v such 
that Mv is equal to the projection of aC^ B onto Col(M). 

The orthogonal projection matrix is M(M M) M so that 
the projection is 



M M M 



M aC, 



1/2 



Let us then denote by a the vector 

a = (m'm) 1 M a. 



(22) 



The projection of aC^ B onto Col(M) then is MaC^j 

1/2 

and vector v is therefore given by v — a C CMB . Recall that 
v is a unit norm vector so we must have: 



v — a/\a\ and Ccmb = l/|<z|" 



(23) 



Once vector v is determined, the constraint that O = [ v | V] 
is a rotation matrix uniquely determines V up to right mul- 
tiplication by a rotation factor. However, nothing more is 
required to determine the foreground subspace, as already 
stressed. Our procedure is therefore complete and can be 
summarized by the following steps: 



Compute the eigen-decomposition ( 20 \ of the noise 



whitened covariance matrix. Obtain an estimate of Nfg 
from comparing the level of the eige nvalues to the noise level. 
• Form matrix M by eq. (21 1, compute vector a by 



eq. ( |22p and get v by normalization ( |23[ ). 

• Compute an (JVfg + 1) x -Afg matrix V such that matrix 
[v | V] is a rotation. 

• Obtain a basis of the foreground subspace as F = M V. 



B 



fYf'r 1 ? 



F R 



3.5 



Projecting foregrounds orthogonally to both 
thermal SZ and CMB 



The foreground multidimensional ILC filter can be gener- 
alised further. Thermal SZ emission can be singled out in 
the same way as the CMB, in which case we may require 
that there is no thermal SZ residual in the reconstructed 
foreground map. This is doable because the emission law of 
the SZ component, like that of the CMB, is known. We then 
write the model of emissions as: 



b F 



(24) 



where we have explicitly distinguished the thermal SZ emis- 
sion z from the other foregrounds through its frequency scal- 
ing vector b (emission law). We may then generalize the 
processing developed in Sec. |3.4| In that spirit, the FG mix- 
ing matrix F can then be estimated in the needlet domain 
considered from the set of Afg columns orthogonal to both 
the projection of the CMB scaling vector and the projec- 
tion of the thermal SZ scaling vector onto the estimated 
(A r FG+2)-dimensional signal subspace. This guarantees that 
the foreground map reconstructed by the multidimensional 
ILC now contains neither SZ nor CMB (with, however, the 
usual caveat that the statistics used to compute the covari- 
ance matrices must be accurate enough). In addition, the 
rank-reduction procedure (restriction of the observations to 
the (Afg + 2)-dimensional signal subspace) in each needlet 
region reduces the level of instrumental noise locally in the 
reconstructed foregrounds. 

3.6 Discussion of special cases 

3. 6. 1 Less channels than foreground dimension 

In the discussion above, we assumed that the signal sub- 
space is the direct sum of two subspaces: the CMB subspace 
which is one-dimensional (because of the rigid scaling of the 
CMB with frequency) and the foreground subspace which is 
AFG-dimensional. The former is not included in the latter 
if no combination of foreground emission has the same scal- 
ing as the CMB across available frequencies. Of course, this 
property requires enough properly chosen frequency chan- 
nels. 

When there are more components than observations, 
then unless the foreground emissions are either fully 
correlated or very faint (below noise), then we have 
Afg + 1 > A^bs and the CMB cannot be perfectly sepa- 
rated from the foregrounds. When there are enough inde- 
pendent observations {i.e. a large number of channels), then 
A'fg < Aobs, and in general the CMB subspace is not con- 
tained in the (larger) foreground subspace. Separation is 
then possible up to finite-sample size errors in the deter- 
mination of the appropriate subspaces. 

Note in the passing that the kinetic SZ cannot be sep- 
arated from the CMB on the basis of its emission law. 
Throughout this paper the CMB, distinguished solely by its 
known emission law a, also includes the kinetic SZ effect. 
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Observation 70GHz 



Input Foreground 70GHz 



Input Thermal SZ 




4.0e-05 y SZ 



Figure 1. Simulated Planck observations. A 12.5° X 12.5° patch of the simulated sky located at low Galactic latitude, around 
Galactic coordinates of (I, b) = (72°, —8°). From left to right: observed map, foreground map, and thermal SZ map at 70 GHz. All maps 
are at the resolution of the 70 GHz channel (14 arc-minutes). 




1000 1500 
Multipole I 



Figure 2. The spectral bands used in this work for the definition 
of the needlets. 



.6.2 CMB subtraction as an (N h s — 1) -dimensional ILC 



A foreground estimation has been obtained in Ghosh et al. 
(20101 by subtracting the CMB-ILC estimate from the ob- 



servations data, / = y — all. It is interesting to note that the 
CMB subtraction procedure is equivalent to an (A^bs — 1)— 
dimensional ILC filtering, i.e. the particular multidimen- 
sional ILC filtering where the dimension of the foreground 
subspace is assumed to be constant over the whole sky and 
the whole range of scales, and equal to (A b s — 1)- Indeed, 
the CMB subtracted estimate expands as follows 

a*R 



f = y 



w 



V 



a* R 



W _1 (l 



I — Wa ((Wo)' Wo) (Wa)'jWy, 

Pi)Wy, (25) 

where W = R _1//2 denotes the inverse square root of the 
data covariance matrix. 

Matrix Pi=Wa((Wa)'Wo) _1 (Wa)' is an orthog- 
onal projection (Pi = Pi and P^ = Pi) onto the line 



Span (Wa) (one-dimensional 'whitened' CMB subspace). 
It implies that I — Pi = P-h is the projection onto 
the (A^bs — l)-dimensional hyperplane H = [Span (Wa)] x , 
which is orthogonal and complementary to the one- 
dimensional whitened CMB subspace. Let us denote 
v = Wa/|Wa| and consider an Aobs X (AT bs — 1) matrix V 



such that [v j V] is a rotation. Then P-h 
by denoting F = W _1 V, we get 



V(V'V) V'and, 



= W X V (V*V) 1 V*Wy, 
= F(F t R- 1 F) _1 F t R- 1 y, 



(26) 



which completes the proof since F is full rank (A^bs — 1). 
Here, it is interesting to notice that the (A^bs — 1)- 
dimensional ILC can be obtained without even knowing the 
mixing matrix F since the procedure becomes equivalent to 
the estimation obtained by subtracting the CMB-ILC esti- 
mate from the observations data. 

This equivalence means that the CMB subtraction 
procedure does not take advantage of the fact that the 
foreground mixing matrix can be almost rank- deficient in 
some regions of the sky or at some scales (for instance at 
small scale where the instrumental noise is dominant). The 
(Nobs — l)-dimensional subspace Col(F) reconstructed here 
thus includes both noise and foregrounds components. Con- 
sequently, such a foreground reconstruction is noisy. The 
AFG-dimensional ILC procedure described in section |3.4| 
performs a cleaner foreground reconstruction (in terms of 
signal to noise ratio) because the effective rank Afg of 
the foreground subspace and the foreground mixing matrix 
(with reduced rank) are estimated locally in each needlet 
domain. In effect, this boils down to performing at the same 
time both component separation, and denoising by thresh- 
olding the needlet coefficients. 
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(Nobs-l)-dimensional FG ILC (no CMB) 



N_FG-dimensional FG ILC (no CMB) 



NJG-dimensional FG ILC (no CMB, no SZ) 




(7a.il, -B.a) aoinct. 




(7a.u, -B.a) <m*i 



(7a.fl, -B.a) aoinct. 



(Nobs— 1) Reconstruction error (no CMB) 



N^G Reconstruction error (no CMB) N_FG Reconstruction error (no CMB, no SZ) 
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Figure 3. 70GHz foreground multidimensional ILC reconstruction at low Galactic latitude around (I, b) = (72°,— 8°). 

Top: CMB-orthogonal (-/V t, s — l)-dimcnsional ILC map, CMB-orthogonal jVpG-dimcnsional ILC map, (CMB+SZ)-orthogonal N-pQ- 
dimensional ILC map. Bottom: error (difference input-output) maps respectively for CMB-orthogonal (-/V v, s — l)-dimensional ILC, 
CMB-orthogonal ./Vpc-dimensional ILC, and (CMB+SZ)-orthogonal Nyg~ dimensional ILC. 



4 PLANCK SIMULATIONS AND RESULTS 

We now turn to illustrating this discussion with examples 
based on simulated data sets. We apply our multidimen- 
sional FG ILC filter on a frame of needlets. The spectral 
bands used in the definition of the needlets are shown in fig- 
ure]^] For each needlet domain considered, we both project 
the data onto the 'full rank' foreground subspace (equivalent 
to simple CMB-ILC subtraction), and onto a 'reduced rank' 
foreground subspace. For the latter, we reject the eigenval- 
ues of the covariance matrix of the observation needlet co- 
efficients smaller than 1.25 times the noise covariance level, 
i.e. values for which the instrumental noise contributes more 
than 80% of the total emission. This is a somewhat arbi- 
trary, but reasonable choice, chosen here for illustration. In 
practice, this threshold can be fixed more rigorously, con- 
sidering the trade-off between rejecting low-level foreground 
emission, and letting in the final map too much instrumental 
noise. 

As a refinement the number of relevant foreground com- 
ponents could be estimated without even imposing any ar- 



bitrary threshold, e.g. by using the Akaike Information Cri- 
terium (Akaike} |1974[ ). This criterium consists in maximizing 
the likelihood of the observations given the model, taking 
into account a particular penalty imposed on the number 
of free parameters entering in the model (e.g. the number 
of foreground components, or equivalently the rank of the 
foreground mixing matrix). 



Our investigations are carried out on sky temperature 
simulations generated with the Planck Sky Model (PSM) 
version 1.6.6. Sky simulations include Gaussian CMB gen- 
erated assuming a theoretical angular spectrum fitting the 
WMAP observations, thermal and kinetic SZ effect, four 
components of Galactic ISM emission including thermal and 
spinning dust, synchrotron, and free-free, and emission from 
point sources (radio and infrared). The resolution and noise 
level of the observations correspond to nominal mission pa- 
rameters as described in the Planck "Blue Book" . Nine fre- 
quency channels are used in this simulation and correspond 
to the Planck HFI and LFI channels. Details about PSM 
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Recovered FG at 70GHz: (Nobs-l)-dimensional ILC Recovered FG at 70GHz: N_FG-dimensional ILC 




-0.40 0.40 niE -0.40 — 0.40 ml 



Reconstruction Error: (Nobs- 1)— dimensional ILC Reconstruction Error: N_FG— dimensional ILC 




Figure 4. Foregrounds reconstruction at 70 GHz. Left panels: (iV D b s — l)-dimensional needlet ILC output map and needlet error 
(difference input-output) map. Right panels: A r FG~ a -i mens i° na l needlet ILC output map and needlet error (difference input-output) map. 
The NpQ— dimensional needlet ILC guarantees the reduction of the noise contamination. 



Effective number of FG components at a scale 512 < 1 < 1100 Effective number °f re components (d12<kuoo) 




. : : II 7,0 3.0 — — 5.0 



Figure 6. Left: Effective number NpQ of foreground components (effective rank of the foreground covariance matrix) at a scale 512 < 
£ < 1100 estimated in each needlet domain from the 80% noise threshold. Components contributing less than 20% of the total observation 
are thus neglected in this analysis (Note that at this scale the original number of useful channels is eight instead of nine because the 
30GHz channel does not have enough resolution). Right: same at low Galactic latitude around (l,b) = (72°,— 8°). 



simulations can be found in |Leach et al.| ( [2008[ ) and |Betoule| 



et al. (2009) 



Figure [y shows the 'observed' 70 GHz map, the input 
foreground map at 70 GHz, and the thermal SZ map, all at 
the resolution of the 70 GHz channel. Our maps are centred 
on an interesting region which is both at low Galactic lati- 



tude, around (l,b) = (72°, —8°) and close to a set of bright 
galaxy clusters. The 70 GHz reconstructed foregrounds, re- 
covered by multidimensional ILC filtering, are shown in the 
same region of the sky on the top panels of figure [3] The 
corresponding reconstruction error (difference between re- 
constructed output and original input) is displayed on the 
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Figure 5. Power spectrum of the recovered foregrounds 
at 70 GHz and of the ILC reconstruction error: (iV b s — 1)— 
dimensional foreground ILC (solid red) and error (dashed red), 
AfpQ-dimensional foreground ILC (solid blue) and error (dashed 
blue) . We clearly see the suppression of the noise at high I for the 
A^fq— dimensional foreground ILC reconstruction. 



bottom row of the same figure. The (N b a — l)-dimensional 
ILC reconstruction (left panels), equivalent to a simple sub- 
traction of the ILC estimate of the CMB map to the ob- 
servation map, is clearly noisy (as expected from the dis- 
cussion of section [3.6.2[ ). The reduction of the noise in the 
foreground reconstruction is achieved by performing a Afg - 
dimensional ILC reconstruction (middle panels), where JVfg 
is the local dimension of the FG subspace depending both on 
the needlet scale and on the pixel. We observe the leakage of 
a thermal SZ emission in the FG reconstruction on the left 
and middle panels of figure [3] Using the modified 'reduced 
rank' ILC introduced in section |3.5[ we obtain instead the 
reconstruction displayed on the right panels of figure [3] with 
no visible contamination by SZ emission. 

For completeness the same results are shown on full 
sky maps in figure [4] and we have plotted the corresponding 
power spectra in figure [5] The suppression of the noise con- 
tamination is clearly visible on the spectrum at high t when 
a A^FG-dimensional ILC method is employed. 

Figure [6] shows, for the fifth needlet band (scales com- 
prised between I = 512 and I = 1100), the effective number 
Nyg of foreground components (i.e. the effective rank of the 
foreground covariance matrix) which has been estimated in 
each needlet domain from the 80% noise threshold (fore- 
ground components contributing less than 20% of the total 
emission in the needlet domain have been neglected). On 
the right panel of the figure, a 12.5° x 12.5° patch of sky 
at low Galactic latitude, centred around (/,&) = (72°,— 8°), 
explicitely shows the effective number of foreground compo- 
nents estimated in this region. This number decreases ac- 
cording to the distance to the Galactic plane. This is consis- 
tent with the bottom right panel of figure [3] showing that the 
residual noise is locally distributed and decreases according 
to the distance to the Galactic plane. 



5 CONCLUSION 

In this article, we have shown how the standard ILC proce- 
dure, originally dedicated to the CMB extraction, can be 



extended for the reconstruction of complex astrophysical 
emissions, beyond the CMB alone. We have developed gener- 
alised ILC filters (multidimensional ILC) to reconstruct the 
diffuse emission of a complex multidimensional component 
originating from multiple correlated emissions, such as the 
total Galactic foreground emission. Similar, though pixel- 
based extensions have been also implemented in a fastlCA- 
based code, AltICA, as used inlLeach et al. (2008) and are in- 
tegrated in the Planck LFI Data Processing Center pipeline 
(C. Baccigalupi, R.Stompor, private communication). Our 
estimators were implemented on a needlet frame and tested 
on simulations of Planck observations. This new ILC filter- 
ing successfully reconstructs the foreground emission, ex- 
empt from both the CMB and the SZ emission, and with a 
reduced level of instrumental noise. 
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